Abstract: In this paper, we study the problem of recovering
a low-rank matrix (the principal components) from a high- 
dimensional data matrix despite both small entry-wise noise 
and gross sparse errors. Recently, it has been shown that a 
convex program, named Principal Component Pursuit (PCP), can 
recover the low-rank matrix when the data matrix is corrupted 
by gross sparse errors. We further prove that the solution to 
a related convex program (a relaxed PCP) gives an estimate of 
the low-rank matrix that is simultaneously stable to small entry- 
wise noise and robust to gross sparse errors. More precisely, 
our result shows that the proposed convex program recovers the 
low-rank matrix even though a positive fraction of its entries 
are arbitrarily corrupted, with an error bound proportional to 
the noise level. We present simulation results to support our 
result and demonstrate that the new convex program accurately 
recovers the principal components (the low-rank matrix) under 
quite broad conditions. To our knowledge, this is the first result 
that shows the classical Principal Component Analysis (PCA), 
optimal for small i.i.d. noise, can be made robust to gross sparse 
errors; or the first that shows the newly proposed PCP can be 
made stable to small entry-wise perturbations. 

I. Introduction 

The advance of modern information technologies has produced
a tremendous amount of high-dimensional data in science,
engineering, and society, such as images, videos, web
documents, and bioinformatics data. It has become a pressing 
challenge to develop efficient and effective tools to pro- 
cess, analyze, and extract useful information from such high- 
dimensional data. One of the fundamental problems here is 
how to extract the intrinsic low-dimensional structure of such 
high-dimensional data. 

a) Classical Principal Component Analysis: Arguably,
the classical Principal Component Analysis (PCA) [1], [2]
is the most widely used statistical tool for high-dimensional
data analysis and dimensionality reduction today. It
assumes that the data approximately lie on a low-dimensional
linear subspace. Mathematically, if we stack all the data points
as column vectors of a matrix $M$, then the matrix should be
approximately low-rank and can be written as $M = L_0 + Z_0$,
where $L_0$ is a low-rank matrix (representing the subspace) and
$Z_0$ models a small noisy perturbation of each entry of $L_0$.
PCA then seeks the best rank-$k$ estimate of $L_0$ in the
$\ell_2$ sense, which can be computed efficiently via the singular value
decomposition (SVD) and thresholding. It can be shown that
if the perturbation is i.i.d. Gaussian, this gives a statistically
optimal estimate of the subspace. Such an estimate is naturally
stable in the sense that the error is bounded in proportion
to the magnitude of the perturbation.
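The rank-$k$ estimate described above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the paper: the function name `best_rank_k`, the matrix sizes, and the noise level are assumptions chosen for the example.

```python
import numpy as np

def best_rank_k(M, k):
    """Best rank-k approximation of M in the Frobenius sense (truncated SVD)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Keep only the k largest singular values and their singular vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
n, k = 50, 3                                                      # illustrative sizes
L0 = rng.standard_normal((n, k)) @ rng.standard_normal((k, n))   # low-rank part
Z0 = 0.01 * rng.standard_normal((n, n))                           # small dense noise
M = L0 + Z0

L_hat = best_rank_k(M, k)
err = np.linalg.norm(L_hat - L0)    # Frobenius norm of the estimation error
noise = np.linalg.norm(Z0)          # Frobenius norm of the perturbation
```

Because `L_hat` is the closest rank-$k$ matrix to `M` and `L0` itself has rank $k$, the triangle inequality gives `err <= 2 * noise`, which is one concrete form of the stability statement above.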

b) Robust PCA via Principal Component Pursuit: How- 
ever, it is well known that the classical PCA breaks down even 
with a single grossly corrupted entry in the data matrix M, i.e., 
it is not robust to gross errors or outliers. Many methods have
been proposed to alleviate this problem; however, none of them
yields a polynomial-time algorithm with strong performance
guarantees (see [3] for a detailed discussion).
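The breakdown under a single gross error is easy to reproduce numerically. The sketch below is an illustration under assumed values (a $20 \times 20$ all-ones matrix and a corruption of size $10^4$); it is not an experiment from the paper.

```python
import numpy as np

n = 20
M_clean = np.ones((n, n))      # rank one; top singular vector is (1,...,1)/sqrt(n)
M_bad = M_clean.copy()
M_bad[0, 0] += 1e4             # a single grossly corrupted entry

def top_left_singular_vector(M):
    U, _, _ = np.linalg.svd(M)
    return U[:, 0]

u_clean = top_left_singular_vector(M_clean)
u_bad = top_left_singular_vector(M_bad)

# |<u_clean, u_bad>| equals 1 when the estimated direction is unchanged.
alignment = abs(u_clean @ u_bad)
```

Here the leading singular vector of the corrupted matrix is almost the first canonical basis vector, so `alignment` drops far below 1 even though only one of the 400 entries changed.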

The recently proposed Principal Component Pursuit (PCP)
method uses a convex program that is guaranteed to recover a
low-rank matrix despite gross sparse errors under rather broad
conditions. Mathematically, it considers a matrix $M$ of the
form $M = L_0 + S_0$, where $L_0$ is low-rank and $S_0$ is a sparse
matrix with most of its entries being zero. Unlike the model for
PCA, here both components can be of arbitrary magnitude, and
no other information about the rank of $L_0$ or the support
or signs of $S_0$ is given. To recover $L_0$ and $S_0$, PCP solves the
following convex optimization problem:¹

$$\min_{L,S}\ \|L\|_* + \lambda\|S\|_1 \quad \text{subject to} \quad M = L + S. \tag{1}$$

It has been shown in [3] that, under surprisingly broad conditions,
the above convex program exactly recovers $L_0$ and $S_0$. Readers
are also referred to [4], which proposed solving the same
problem but with different exact recovery conditions.

c) Main Assumptions: Since our analysis and result are
largely based on the same conditions as PCP, for completeness
we summarize the precise conditions and result of PCP
here. Let $L_0 = U\Sigma V^* = \sum_{i=1}^{r} \sigma_i u_i v_i^*$ denote the singular
value decomposition of $L_0 \in \mathbb{R}^{n_1 \times n_2}$, where $r$ is the rank,
$\sigma_1, \ldots, \sigma_r$ are the singular values, and $U = [u_1, \ldots, u_r]$,
$V = [v_1, \ldots, v_r]$ are the matrices of left- and right-singular vectors,
respectively. The incoherence conditions on $U$ and $V$ with
parameter $\mu$ are as follows:

$$\max_i \|U^* e_i\|^2 \le \frac{\mu r}{n_1}, \quad \max_i \|V^* e_i\|^2 \le \frac{\mu r}{n_2}, \quad \|UV^*\|_\infty \le \sqrt{\frac{\mu r}{n_1 n_2}}, \tag{2}$$

where the $e_i$'s are the canonical basis vectors. Now let $\|S_0\|_0 = m$
be the number of nonzero entries in $S_0$. The conditions
on $S_0$ address the identifiability issue that arises when $S_0$ is also
low-rank. To avoid such pathological cases, [3] assumes that
the support of the sparse component $S_0$ is selected uniformly at
random among all subsets of size $m$. Under these conditions,
the main result of [3] states:

¹ In this paper, we use five norms of a matrix $A$. $\|A\|_*$ denotes its nuclear
norm (the sum of its singular values), $\|A\|_F$ denotes its Frobenius norm, and
$\|A\|$ denotes its 2-norm. Moreover, $\|A\|_1$ and $\|A\|_\infty$ are the $\ell_1$ and $\ell_\infty$
norms of $A$ viewed as a vector, respectively.



Theorem 1 ([3]). Suppose $L_0 \in \mathbb{R}^{n \times n}$ obeys (2) and that
the support set of $S_0$ is uniformly distributed. Then there is
a numerical constant $c$ such that with probability at least
$1 - cn^{-10}$ (over the choice of support of $S_0$), Principal
Component Pursuit (1) with $\lambda = 1/\sqrt{n}$ recovers $L_0$ and $S_0$
exactly, provided that

$$\operatorname{rank}(L_0) \le \rho_r\, n\, \mu^{-1} (\log n)^{-2} \quad \text{and} \quad m \le \rho_s n^2, \tag{3}$$

where $\rho_r$ and $\rho_s$ are some positive constants.
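To make the incoherence parameter in (2) concrete, the following sketch computes the smallest $\mu$ for which all three inequalities hold for a given matrix. The helper name `incoherence` and the two test matrices are assumptions made for illustration only.

```python
import numpy as np

def incoherence(L0, r):
    """Smallest mu such that the three inequalities in (2) hold for a rank-r L0."""
    n1, n2 = L0.shape
    U, _, Vt = np.linalg.svd(L0, full_matrices=False)
    U, V = U[:, :r], Vt[:r, :].T
    # max_i ||U* e_i||^2 <= mu r / n1 : squared row norms of U
    mu_U = (n1 / r) * np.max(np.sum(U**2, axis=1))
    # max_i ||V* e_i||^2 <= mu r / n2 : squared row norms of V
    mu_V = (n2 / r) * np.max(np.sum(V**2, axis=1))
    # ||U V*||_inf <= sqrt(mu r / (n1 n2)) : largest entry of U V* in magnitude
    mu_UV = (n1 * n2 / r) * np.max(np.abs(U @ V.T)) ** 2
    return max(mu_U, mu_V, mu_UV)

n = 8
mu_flat = incoherence(np.ones((n, n)), r=1)   # singular vectors spread evenly

E = np.zeros((n, n))
E[0, 0] = 1.0
mu_spiky = incoherence(E, r=1)                # singular vectors equal to e_1
```

The all-ones matrix attains the smallest possible value $\mu = 1$, while the single-spike matrix, which is both low-rank and sparse, gives $\mu = n^2 = 64$. The second case is exactly the kind of non-identifiable input that the incoherence condition rules out.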

The analysis and result of PCP apply to any rectangular
($n_1 \times n_2$) matrix, and so does the result of this paper. To
simplify the presentation, however, we assume that the matrices are
all square and write $n = n_1 = n_2$. The modification needed
for general rectangular matrices is straightforward and is
briefly discussed at the end of the paper.

A. Main Result of This Paper 

The PCP result [3] is limited to the case where the low-rank component
is exactly low-rank and the sparse component is exactly
sparse. In real-world applications, however, the observations
are often corrupted by noise, which may be stochastic or deterministic
and affects every entry of the data matrix. For example,
in face recognition, the human face is not a strictly convex
and Lambertian surface, so the low-rank component is only approximately
low-rank and a small perturbation must be accounted for. In ranking and collaborative
filtering, users' ratings can be noisy because of the lack
of control in the data collection process. Therefore, for the
techniques developed in [3] to be widely applicable, results
that guarantee stable and accurate recovery in the presence of
entry-wise noise must be established.

The new measurement model that we consider in this paper
assumes that we observe

$$M = L_0 + S_0 + Z_0, \tag{4}$$

where $Z_0$ is a noise term, say i.i.d. noise on each entry of the
matrix. However, all we assume about $Z_0$ in this paper is that
$\|Z_0\|_F \le \delta$ for some $\delta > 0$. To recover the unknown matrices
$L_0$ and $S_0$, we propose solving the following optimization
problem, a relaxed version of PCP (1):

$$\min_{L,S}\ \|L\|_* + \lambda\|S\|_1 \quad \text{subject to} \quad \|M - L - S\|_F \le \delta, \tag{5}$$

where we choose $\lambda = 1/\sqrt{n}$. Our main result is that under
the same conditions as PCP, the above convex program gives
a stable estimate of $L_0$ and $S_0$:

Theorem 2. Suppose again that $L_0$ obeys (2) and the support
of $S_0$ is uniformly distributed. Then if $L_0$ and $S_0$ satisfy (3)
with $\rho_r, \rho_s > 0$ being sufficiently small numerical constants,
with high probability in the support of $S_0$, for any $Z_0$ with
$\|Z_0\|_F \le \delta$, the solution $(\hat{L}, \hat{S})$ to the convex program (5)
satisfies

$$\|\hat{L} - L_0\|_F^2 + \|\hat{S} - S_0\|_F^2 \le C n^2 \delta^2, \tag{6}$$

where $C$ is a numerical constant.

The precise form of the constant $C$ will be given in
Proposition 4. Here, we would like to point out two ways
to view the significance of this result. To some extent, our
model unifies the classical PCA and the robust PCA by
considering both gross sparse errors and small entry-wise noise
in the measurements. On the one hand, our result says that the
low-rank and sparse decomposition via PCP is stable in the
presence of small entry-wise noise, making PCP more
widely applicable to practical problems where the low-rank
structure is not exact. On the other hand, together with the
result of PCP [3], our new result shows that
the classical PCA can be made robust to sparse gross
corruptions via certain convex programs. Since this convex
program can be solved very efficiently [5], at a cost not
much higher than that of the classical PCA, our result is expected to
have significant impact on many practical problems.

B. Relations to Existing Work 

Aside from its close relations to the classical PCA and 
the newly proposed robust PCA work mentioned above, our 
analysis and result are closely related to two lines of develop- 
ment, regarding stable recovery of sparse signals and low-rank 
matrices, respectively. 

Conceptually, our work is very similar to the development
of results for the "imperfect" scenarios in compressive sensing,
where the measurements are noisy and the signal is not exactly
sparse. More precisely, $\ell_1$-norm minimization techniques are
adapted to recover a vector $x_0 \in \mathbb{R}^m$ from incomplete and
contaminated observations $y = Ax_0 + z$, where $A$ is an $n \times m$
matrix with $n \ll m$ and $z$ is the noise term. After the landmark
work of [6], which established that in the noise-free case the
minimal $\ell_1$-norm solution exactly recovers the sparse signal
under fairly broad conditions, later works have demonstrated
that stable recovery occurs for most measurement ensembles
[7], or particularly, when the measurement ensembles satisfy
some simple incoherence conditions [8] or the restricted isometry
property (RIP) [9].

Recently, there has been an explosion of literature regarding
the power of nuclear-norm minimization in recovering low-rank
matrices from under-sampled measurements. A matrix
RIP was first proposed in [10] to connect compressive sensing
with low-rank matrix recovery. For measurement ensembles
obeying the RIP, tight bounds on the recovery error from noisy
data have been developed in [11]; these bounds are within a constant
of the minimax risk and an oracle error. See also [12] for
similar results. Technically, our work is more closely related
to the recent work [13], which developed the first stability result
for the matrix completion problem under small perturbations.
Naturally, in establishing the stability result for robust PCA,
we borrow heavily from the techniques used in [13] and [3].

II. Notation and Outline of Analysis 

Our goal is to show that in cases where the noise-free
Principal Component Pursuit (1) exactly recovers $(L_0, S_0)$, the
noise-aware version (5) stably estimates $(L_0, S_0)$. In the noise-free
case, exact recovery is guaranteed by the existence of a
"dual certificate" $W$ described in Lemma 3 below. The main
result of [3] is to show that under the stated conditions, with
high probability such a dual certificate exists. Proposition
4 below then shows that the existence of such a certificate
also implies that the recovery via (5) under noise is stable.

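Before continuing, we fix some notation. Given a matrix
pair $X_0 = (L_0, S_0)$, let $\Omega \subseteq [n] \times [n]$ denote the support of
$S_0$, and $\mathcal{P}_\Omega$ denote the projection operator onto the space of
matrices supported on $\Omega$. Let $r = \operatorname{rank}(L_0)$, and let
$L_0 = U\Sigma V^*$ denote the compact singular value decomposition of
$L_0$, with $U, V \in \mathbb{R}^{n \times r}$ and $\Sigma \in \mathbb{R}^{r \times r}$. We will let $T$ denote the
subspace generated by matrices with the same column space
or row space as $L_0$:

$$T = \{UQ^* + RV^* \mid Q, R \in \mathbb{R}^{n \times r}\} \subseteq \mathbb{R}^{n \times n},$$

and $\mathcal{P}_T$ be the projection operator onto this subspace.

For any pair $X = (L, S)$ let $\|X\|_F = (\|L\|_F^2 + \|S\|_F^2)^{1/2}$,
and define the projection operator $\mathcal{P}_T \times \mathcal{P}_\Omega : (L, S) \mapsto (\mathcal{P}_T L, \mathcal{P}_\Omega S)$.
Define the subspaces $\Gamma = \{(Q, Q) \mid Q \in \mathbb{R}^{n \times n}\}$ and
$\Gamma^\perp = \{(Q, -Q) \mid Q \in \mathbb{R}^{n \times n}\}$, and let $\mathcal{P}_\Gamma$ and
$\mathcal{P}_{\Gamma^\perp}$ denote their respective projection operators. Finally, for
any linear operator $\mathcal{A} : \mathbb{R}^{n \times n} \to \mathbb{R}^{n \times n}$, we use $\|\mathcal{A}\|$ to denote
the operator norm $\sup_{\|X\|_F = 1} \|\mathcal{A}X\|_F$.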
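The projection operators fixed above have simple closed forms, sketched here in NumPy. The random instance, the boolean mask `omega` standing in for the support, and the function names are assumptions for illustration; the formula used for the projection onto $T$ is the standard one, $\mathcal{P}_T X = UU^*X + XVV^* - UU^*XVV^*$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 12, 2                                   # illustrative sizes
L0 = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
U, _, Vt = np.linalg.svd(L0, full_matrices=False)
U, V = U[:, :r], Vt[:r, :].T
omega = rng.random((n, n)) < 0.1               # boolean mask playing the role of Omega

def P_T(X):
    """Projection onto T = {U Q* + R V*}."""
    PU, PV = U @ U.T, V @ V.T
    return PU @ X + X @ PV - PU @ X @ PV

def P_Tperp(X):
    """Projection onto the orthogonal complement of T."""
    return X - P_T(X)

def P_Omega(X):
    """Keep the entries on the support, zero out the rest."""
    return np.where(omega, X, 0.0)

def P_Gamma(L, S):
    """Projection of the pair (L, S) onto Gamma = {(Q, Q)}."""
    Q = (L + S) / 2
    return Q, Q

X = rng.standard_normal((n, n))
```

As a quick sanity check, `P_T` should be idempotent, should leave `L0` unchanged, and should produce a result orthogonal to `P_Tperp(X)` in the trace inner product.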

With this notation, the optimality conditions for (1) can
be stated in terms of a dual vector as follows.

Lemma 3 (Lemma 2.5 in [3]). Assume that $\|\mathcal{P}_\Omega \mathcal{P}_T\| \le 1/2$
and $\lambda < 1$. Suppose that there exists $W$ such that

$$\begin{cases} W \in T^\perp, \quad \|W\| < 1/2, \\ \|\mathcal{P}_\Omega(UV^* - \lambda\,\mathrm{sgn}(S_0) + W)\|_F \le \lambda/4, \\ \|\mathcal{P}_{\Omega^\perp}(UV^* + W)\|_\infty < \lambda/2. \end{cases} \tag{7}$$

Then the pair $(L_0, S_0)$ is the unique optimal solution to (1).

From now on, we will write $\lambda\mathcal{P}_\Omega D = \mathcal{P}_\Omega(UV^* - \lambda\,\mathrm{sgn}(S_0) + W)$.
The following proposition shows that under
the existence of such a dual certificate, (5) will also stably
recover $L_0$ and $S_0$ in the presence of noise.

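Proposition 4. Assume $\|\mathcal{P}_\Omega \mathcal{P}_T\| \le 1/2$, $\lambda \le 1/2$, and that
there exists a dual certificate $W$ satisfying (7). Let $\hat{X} = (\hat{L}, \hat{S})$
be the solution to (5) and $X_0 = (L_0, S_0)$; then $\hat{X}$ satisfies

$$\|\hat{X} - X_0\|_F \le (8\sqrt{5}\,n + \sqrt{2})\,\delta. \tag{8}$$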
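As a worked step, one admissible value of the constant $C$ in (6) can be read off from (8) by squaring both sides and using $n \ge 1$. The numerical value below is only this one choice and is not claimed to be the smallest possible:

```latex
\|\hat{L}-L_0\|_F^2 + \|\hat{S}-S_0\|_F^2
  = \|\hat{X}-X_0\|_F^2
  \le \bigl(8\sqrt{5}\,n+\sqrt{2}\bigr)^2\delta^2
  \le \bigl(8\sqrt{5}+\sqrt{2}\bigr)^2 n^2\delta^2
  = \bigl(322+16\sqrt{10}\bigr)\,n^2\delta^2
  \approx 372.6\,n^2\delta^2 .
```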

Proposition 4 implies Theorem 2, since under the conditions
of Theorem 2, Lemma 2.8 and Lemma 2.9 of [3] show that
with high probability there indeed exists such a dual certificate
$W$, and Corollary 2.7 of [3] proves $\|\mathcal{P}_\Omega \mathcal{P}_T\| \le 1/2$ as well.

The rest of the paper sets out to prove Proposition
4 and is organized as follows. In Section III we prove two
key lemmas on which our main result depends. The proof of
Proposition 4 then follows in Section IV. We further provide
numerical results in Section V to support our analysis and
conclude the paper with additional discussions in Section VI.

III. Two Lemmas 

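In this section, we prove two lemmas which will be useful
in the development of our main result. For any matrix pair
$X = (L, S)$, we define $\|X\|_\diamond = \|L\|_* + \lambda\|S\|_1$.

Lemma 5. Assume $\|\mathcal{P}_\Omega \mathcal{P}_T\| \le 1/2$ and $\lambda \le 1/2$. Suppose
that there exists a dual certificate $W$ satisfying (7) and write
$\Lambda = UV^* + W$. Then for any perturbation $H = (H_L, H_S)$
obeying $H_L + H_S = 0$,

$$\|X_0 + H\|_\diamond \ge \|X_0\|_\diamond + \bigl(3/4 - \|\mathcal{P}_{T^\perp}(\Lambda)\|\bigr)\|\mathcal{P}_{T^\perp}(H_L)\|_* + \bigl(3\lambda/4 - \|\mathcal{P}_{\Omega^\perp}(\Lambda)\|_\infty\bigr)\|\mathcal{P}_{\Omega^\perp}(H_S)\|_1.$$

Proof: For any $Z = (Z_L, Z_S) \in \partial\|X_0\|_\diamond$, we have

$$\|X_0 + H\|_\diamond \ge \|X_0\|_\diamond + \langle Z_L, H_L\rangle + \langle Z_S, H_S\rangle.$$

Now due to the form of the subgradients of the $\ell_1$ norm and the
nuclear norm,² we have the identities $Z_L = \Lambda + \mathcal{P}_{T^\perp}(Z_L - \Lambda)$
and $Z_S = \Lambda - \lambda\mathcal{P}_\Omega D + \mathcal{P}_{\Omega^\perp}(Z_S - \Lambda)$. Thus we have:

$$\begin{aligned} \langle Z_L, H_L\rangle + \langle Z_S, H_S\rangle &= \langle \Lambda, H_L\rangle + \langle \mathcal{P}_{T^\perp}(Z_L - \Lambda), H_L\rangle + \langle \Lambda - \lambda\mathcal{P}_\Omega D, H_S\rangle + \langle \mathcal{P}_{\Omega^\perp}(Z_S - \Lambda), H_S\rangle \\ &\ge \langle Z_L - \Lambda, \mathcal{P}_{T^\perp}(H_L)\rangle + \langle Z_S - \Lambda, \mathcal{P}_{\Omega^\perp}(H_S)\rangle - \frac{\lambda}{4}\|\mathcal{P}_\Omega(H_S)\|_F, \end{aligned}$$

since $H_L + H_S = 0$ and $\|\mathcal{P}_\Omega D\|_F \le 1/4$.

Moreover, by duality, there exists $Z_L^* \in \partial\|L_0\|_*$ with
$\|Z_L^*\| \le 1$ such that $\langle Z_L^*, \mathcal{P}_{T^\perp}(H_L)\rangle = \|\mathcal{P}_{T^\perp}(H_L)\|_*$.
Also notice that $|\langle \Lambda, \mathcal{P}_{T^\perp}(H_L)\rangle| = |\langle \mathcal{P}_{T^\perp}(\Lambda), \mathcal{P}_{T^\perp}(H_L)\rangle| \le \|\mathcal{P}_{T^\perp}(\Lambda)\|\,\|\mathcal{P}_{T^\perp}(H_L)\|_*$.
Therefore, letting $Z_L = Z_L^*$, we have:

$$\langle Z_L - \Lambda, \mathcal{P}_{T^\perp}(H_L)\rangle \ge \bigl(1 - \|\mathcal{P}_{T^\perp}(\Lambda)\|\bigr)\|\mathcal{P}_{T^\perp}(H_L)\|_*.$$

Similarly, by duality, there exists $Z_S^* \in \partial(\lambda\|S_0\|_1)$ with
$\|Z_S^*\|_\infty \le \lambda$ such that $\langle Z_S^*, \mathcal{P}_{\Omega^\perp}(H_S)\rangle = \lambda\|\mathcal{P}_{\Omega^\perp}(H_S)\|_1$.
Therefore, choosing $Z_S = Z_S^*$, we have:

$$\langle Z_S - \Lambda, \mathcal{P}_{\Omega^\perp}(H_S)\rangle \ge \bigl(\lambda - \|\mathcal{P}_{\Omega^\perp}(\Lambda)\|_\infty\bigr)\|\mathcal{P}_{\Omega^\perp}(H_S)\|_1.$$

Observe now that

$$\begin{aligned} \|\mathcal{P}_\Omega(H_S)\|_F &\le \|\mathcal{P}_\Omega\mathcal{P}_T(H_S)\|_F + \|\mathcal{P}_\Omega\mathcal{P}_{T^\perp}(H_S)\|_F \\ &\le \tfrac{1}{2}\|H_S\|_F + \|\mathcal{P}_{T^\perp}(H_S)\|_F \\ &\le \tfrac{1}{2}\|\mathcal{P}_\Omega(H_S)\|_F + \tfrac{1}{2}\|\mathcal{P}_{\Omega^\perp}(H_S)\|_F + \|\mathcal{P}_{T^\perp}(H_S)\|_F, \end{aligned}$$

therefore,

$$\|\mathcal{P}_\Omega(H_S)\|_F \le \|\mathcal{P}_{\Omega^\perp}(H_S)\|_F + 2\|\mathcal{P}_{T^\perp}(H_S)\|_F \le \|\mathcal{P}_{\Omega^\perp}(H_S)\|_1 + 2\|\mathcal{P}_{T^\perp}(H_L)\|_*.$$

Combining the inequalities above, we have

$$\begin{aligned} \|X_0 + H\|_\diamond &\ge \|X_0\|_\diamond + \bigl(1 - \lambda/2 - \|\mathcal{P}_{T^\perp}(\Lambda)\|\bigr)\|\mathcal{P}_{T^\perp}(H_L)\|_* + \bigl(\lambda - \lambda/4 - \|\mathcal{P}_{\Omega^\perp}(\Lambda)\|_\infty\bigr)\|\mathcal{P}_{\Omega^\perp}(H_S)\|_1 \\ &\ge \|X_0\|_\diamond + \bigl(3/4 - \|\mathcal{P}_{T^\perp}(\Lambda)\|\bigr)\|\mathcal{P}_{T^\perp}(H_L)\|_* + \bigl(3\lambda/4 - \|\mathcal{P}_{\Omega^\perp}(\Lambda)\|_\infty\bigr)\|\mathcal{P}_{\Omega^\perp}(H_S)\|_1. \end{aligned}$$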



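Lemma 6. Suppose that $\|\mathcal{P}_T\mathcal{P}_\Omega\| \le 1/2$. Then for any pair
$X = (L, S)$, $\|\mathcal{P}_\Gamma(\mathcal{P}_T \times \mathcal{P}_\Omega)(X)\|_F^2 \ge \frac{1}{4}\|(\mathcal{P}_T \times \mathcal{P}_\Omega)(X)\|_F^2$.

Proof: For any matrix pair $X' = (L', S')$, $\mathcal{P}_\Gamma(X') = \bigl(\frac{L' + S'}{2}, \frac{L' + S'}{2}\bigr)$
and so $\|\mathcal{P}_\Gamma(X')\|_F^2 = \frac{1}{2}\|L' + S'\|_F^2$. So,

$$\|\mathcal{P}_\Gamma(\mathcal{P}_T \times \mathcal{P}_\Omega)(X)\|_F^2 = \tfrac{1}{2}\|\mathcal{P}_T(L) + \mathcal{P}_\Omega(S)\|_F^2 = \tfrac{1}{2}\bigl(\|\mathcal{P}_T(L)\|_F^2 + \|\mathcal{P}_\Omega(S)\|_F^2 + 2\langle \mathcal{P}_T(L), \mathcal{P}_\Omega(S)\rangle\bigr).$$

Now,

$$\langle \mathcal{P}_T(L), \mathcal{P}_\Omega(S)\rangle = \langle \mathcal{P}_T(L), (\mathcal{P}_T\mathcal{P}_\Omega)\mathcal{P}_\Omega(S)\rangle \ge -\|\mathcal{P}_T\mathcal{P}_\Omega\|\,\|\mathcal{P}_T(L)\|_F\,\|\mathcal{P}_\Omega(S)\|_F.$$

Since $\|\mathcal{P}_T\mathcal{P}_\Omega\| \le 1/2$,

$$\begin{aligned} \|\mathcal{P}_\Gamma(\mathcal{P}_T \times \mathcal{P}_\Omega)(X)\|_F^2 &\ge \tfrac{1}{2}\bigl(\|\mathcal{P}_T(L)\|_F^2 + \|\mathcal{P}_\Omega(S)\|_F^2 - \|\mathcal{P}_T(L)\|_F\|\mathcal{P}_\Omega(S)\|_F\bigr) \\ &\ge \tfrac{1}{4}\bigl(\|\mathcal{P}_T(L)\|_F^2 + \|\mathcal{P}_\Omega(S)\|_F^2\bigr) = \tfrac{1}{4}\|(\mathcal{P}_T \times \mathcal{P}_\Omega)(X)\|_F^2, \end{aligned}$$

where we have used that for any $a, b$, $a^2 + b^2 - ab \ge (a^2 + b^2)/2$. ∎

² That is, $Z_S = \lambda(\mathrm{sgn}(S_0) + F)$ with $\mathcal{P}_\Omega F = 0$ and $\|F\|_\infty \le 1$; and
$Z_L = UV^* + W'$ with $\mathcal{P}_T W' = 0$ and $\|W'\| \le 1$.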

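IV. Proof of Proposition 4

Our proof uses two crucial properties of $\hat{X}$. First, since $X_0$
is also a feasible solution to (5), we have $\|\hat{X}\|_\diamond \le \|X_0\|_\diamond$.
Second, we use the triangle inequality to get

$$\|\hat{L} + \hat{S} - L_0 - S_0\|_F \le \|\hat{L} + \hat{S} - M\|_F + \|L_0 + S_0 - M\|_F \le 2\delta. \tag{9}$$

Furthermore, set $\hat{X} = X_0 + H$ where $H = (H_L, H_S)$ and
write $H^\Gamma = \mathcal{P}_\Gamma(H)$, $H^{\Gamma^\perp} = \mathcal{P}_{\Gamma^\perp}(H)$ for short. We want to
bound $\|H\|_F^2$, which can be expanded as

$$\|H\|_F^2 = \|H^\Gamma\|_F^2 + \|H^{\Gamma^\perp}\|_F^2 = \|H^\Gamma\|_F^2 + \|(\mathcal{P}_T \times \mathcal{P}_\Omega)(H^{\Gamma^\perp})\|_F^2 + \|(\mathcal{P}_{T^\perp} \times \mathcal{P}_{\Omega^\perp})(H^{\Gamma^\perp})\|_F^2. \tag{10}$$

Since (9) gives us $\|H^\Gamma\|_F = \bigl(\|(H_L + H_S)/2\|_F^2 + \|(H_L + H_S)/2\|_F^2\bigr)^{1/2} \le \frac{1}{\sqrt{2}} \times 2\delta = \sqrt{2}\,\delta$,
it suffices to bound the
second and third terms on the right-hand side of (10).

a. Bound the third term of (10). Let $W$ be a dual certificate
satisfying (7). Then, $\Lambda = UV^* + W$ obeys $\|\mathcal{P}_{T^\perp}(\Lambda)\| \le 1/2$
and $\|\mathcal{P}_{\Omega^\perp}(\Lambda)\|_\infty \le \lambda/2$. We have

$$\|X_0 + H\|_\diamond \ge \|X_0 + H^{\Gamma^\perp}\|_\diamond - \|H^\Gamma\|_\diamond \tag{11}$$

and, applying Lemma 5 to $H^{\Gamma^\perp} = (H_L^{\Gamma^\perp}, H_S^{\Gamma^\perp})$, whose two components sum to zero,

$$\begin{aligned} \|X_0 + H^{\Gamma^\perp}\|_\diamond &\ge \|X_0\|_\diamond + \bigl(3/4 - \|\mathcal{P}_{T^\perp}(\Lambda)\|\bigr)\|\mathcal{P}_{T^\perp}(H_L^{\Gamma^\perp})\|_* + \bigl(3\lambda/4 - \|\mathcal{P}_{\Omega^\perp}(\Lambda)\|_\infty\bigr)\|\mathcal{P}_{\Omega^\perp}(H_S^{\Gamma^\perp})\|_1 \\ &\ge \|X_0\|_\diamond + \tfrac{1}{4}\bigl(\|\mathcal{P}_{T^\perp}(H_L^{\Gamma^\perp})\|_* + \lambda\|\mathcal{P}_{\Omega^\perp}(H_S^{\Gamma^\perp})\|_1\bigr), \end{aligned}$$

which, together with $\|X_0 + H\|_\diamond = \|\hat{X}\|_\diamond \le \|X_0\|_\diamond$, implies that

$$\|\mathcal{P}_{T^\perp}(H_L^{\Gamma^\perp})\|_* + \lambda\|\mathcal{P}_{\Omega^\perp}(H_S^{\Gamma^\perp})\|_1 \le 4\|H^\Gamma\|_\diamond. \tag{12}$$

For any matrix $Y \in \mathbb{R}^{n \times n}$, we have the following inequalities:

$$\|Y\|_F \le \|Y\|_* \le \sqrt{n}\,\|Y\|_F, \qquad \frac{1}{\sqrt{n}}\|Y\|_F \le \lambda\|Y\|_1 \le \sqrt{n}\,\|Y\|_F,$$

where we assume $\lambda = \frac{1}{\sqrt{n}}$. Therefore

$$\begin{aligned} \|(\mathcal{P}_{T^\perp} \times \mathcal{P}_{\Omega^\perp})(H^{\Gamma^\perp})\|_F &\le \|\mathcal{P}_{T^\perp}(H_L^{\Gamma^\perp})\|_F + \|\mathcal{P}_{\Omega^\perp}(H_S^{\Gamma^\perp})\|_F \\ &\le \|\mathcal{P}_{T^\perp}(H_L^{\Gamma^\perp})\|_* + \lambda\sqrt{n}\,\|\mathcal{P}_{\Omega^\perp}(H_S^{\Gamma^\perp})\|_1 \\ &\le 4\sqrt{n}\,\|H^\Gamma\|_\diamond = 4\sqrt{n}\bigl(\|H_L^\Gamma\|_* + \lambda\|H_S^\Gamma\|_1\bigr) \\ &\le 4n\bigl(\|H_L^\Gamma\|_F + \|H_S^\Gamma\|_F\bigr) = 4\sqrt{2}\,n\,\|H^\Gamma\|_F \le 8n\delta, \end{aligned} \tag{13}$$

where the last equation uses the fact that $H_L^\Gamma = H_S^\Gamma$.

b. Bound the second term of (10). By Lemma 6,

$$\|\mathcal{P}_\Gamma(\mathcal{P}_T \times \mathcal{P}_\Omega)(H^{\Gamma^\perp})\|_F^2 \ge \tfrac{1}{4}\|(\mathcal{P}_T \times \mathcal{P}_\Omega)(H^{\Gamma^\perp})\|_F^2.$$

But since $\mathcal{P}_\Gamma(H^{\Gamma^\perp}) = 0 = \mathcal{P}_\Gamma(\mathcal{P}_T \times \mathcal{P}_\Omega)(H^{\Gamma^\perp}) + \mathcal{P}_\Gamma(\mathcal{P}_{T^\perp} \times \mathcal{P}_{\Omega^\perp})(H^{\Gamma^\perp})$, we have

$$\|\mathcal{P}_\Gamma(\mathcal{P}_T \times \mathcal{P}_\Omega)(H^{\Gamma^\perp})\|_F = \|\mathcal{P}_\Gamma(\mathcal{P}_{T^\perp} \times \mathcal{P}_{\Omega^\perp})(H^{\Gamma^\perp})\|_F \le \|(\mathcal{P}_{T^\perp} \times \mathcal{P}_{\Omega^\perp})(H^{\Gamma^\perp})\|_F.$$

Combining the previous two inequalities, we have

$$\|(\mathcal{P}_T \times \mathcal{P}_\Omega)(H^{\Gamma^\perp})\|_F^2 \le 4\|(\mathcal{P}_{T^\perp} \times \mathcal{P}_{\Omega^\perp})(H^{\Gamma^\perp})\|_F^2,$$

which, together with (13), gives us the desired result,

$$\|H^{\Gamma^\perp}\|_F^2 \le 5\|(\mathcal{P}_{T^\perp} \times \mathcal{P}_{\Omega^\perp})(H^{\Gamma^\perp})\|_F^2 \le 64 \times 5 \times n^2\delta^2. \tag{14}$$

Taking square roots in (14) and adding the bound $\|H^\Gamma\|_F \le \sqrt{2}\,\delta$ yields
$\|H\|_F \le \|H^\Gamma\|_F + \|H^{\Gamma^\perp}\|_F \le (8\sqrt{5}\,n + \sqrt{2})\,\delta$, which is (8).

V. Simulations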

In this section, we run a series of numerical experiments
on square matrices with noisy entries. For each setting of
parameters, we report the average errors over 20 trials. Each
entry of the noise term $Z_0$ is i.i.d. $\mathcal{N}(0, \sigma^2)$. A rank-$r$ matrix
$L_0$ is generated as $L_0 = UV^*$, where both $U$ and $V$ are $n \times r$
matrices with i.i.d. $\mathcal{N}(0, \sigma_n^2)$ entries, with $\sigma_n^2 = 10\sigma/\sqrt{n}$. Here,
the value of $\sigma_n$ is rather arbitrary and set such that the singular
values of $L_0$ are much larger than the singular values of $Z_0$.
The entries of $S_0$ are independently distributed, each taking
the value 0 with probability $1 - \rho_s$, and uniformly distributed
in $[-5, 5]$ with probability $\rho_s$.
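A minimal sketch of this data-generation procedure is given below. The function name `make_instance` and the random seed are illustrative assumptions, and the scaling $\sigma_n^2 = 10\sigma/\sqrt{n}$ used in the code is an assumed reading of the text (the formula is partly illegible in this copy).

```python
import numpy as np

def make_instance(n, r, rho_s, sigma, rng):
    """Draw M = L0 + S0 + Z0 following the simulation setup described above."""
    # Assumption: sigma_n^2 = 10 * sigma / sqrt(n).
    sigma_n = np.sqrt(10 * sigma / np.sqrt(n))
    U = sigma_n * rng.standard_normal((n, r))
    V = sigma_n * rng.standard_normal((n, r))
    L0 = U @ V.T                                     # rank-r component
    support = rng.random((n, n)) < rho_s             # each entry corrupted w.p. rho_s
    S0 = np.where(support, rng.uniform(-5, 5, size=(n, n)), 0.0)
    Z0 = sigma * rng.standard_normal((n, n))         # i.i.d. N(0, sigma^2) noise
    return L0 + S0 + Z0, L0, S0, Z0

rng = np.random.default_rng(0)
M, L0, S0, Z0 = make_instance(n=200, r=10, rho_s=0.2, sigma=0.1, rng=rng)
```

With this scaling the singular values of `L0` are on the order of $n\sigma_n^2 = 10\sigma\sqrt{n}$, several times larger than the spectral norm of `Z0`, which matches the stated intent of the choice.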

In order to stably recover $\hat{X} = (\hat{L}, \hat{S})$, instead of directly
solving (5), we solve the following dual problem, to which a
fast proximal gradient algorithm proposed in [5], Accelerated
Proximal Gradient (APG), can be applied:

$$\min_{L,S}\ \|L\|_* + \lambda\|S\|_1 + \frac{1}{2\mu}\|M - L - S\|_F^2. \tag{15}$$

It is well established that (15) is equivalent to (5) for some
value $\mu(\delta)$. Our choice of $\mu$ here follows similar arguments
as in [13]. First, note that if we fix $S = 0$ in (15), the solution
$\hat{L}$ of (15) is equal to the singular value thresholding version
of $M$ with threshold $\mu$. Similarly, if we fix $L = 0$ in (15), the
solution $\hat{S}$ is equal to the entry-wise shrinkage version of $M$
with threshold $\mu\lambda$. Thus, we choose $\mu$ to be the smallest value
such that the minimizer of (15) is likely to be $\hat{L} = \hat{S} = 0$
if we set $L_0 = S_0 = 0$ and $M = Z_0$. In this way, $\mu$ is
large enough to threshold away the noise, but not so large
as to over-shrink the original matrices. Now, it is well known
that for $Z_0 \in \mathbb{R}^{n \times n}$, $n^{-1/2}\|Z_0\| \to \sqrt{2}\,\sigma$ almost surely as
$n \to \infty$. Thus, we choose $\mu = \sqrt{2n}\,\sigma$. This also fits the
sparse component well since $\mu\lambda = \sqrt{2}\,\sigma$. We shall see that
this choice of $\mu$ works well in practice.
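The two thresholding operators mentioned above, and the parameter choices $\lambda = 1/\sqrt{n}$ and $\mu = \sqrt{2n}\,\sigma$, can be sketched as follows. Note that the solver shown is a plain alternating minimization of (15), swapped in here for brevity; it is not the APG algorithm used in the paper, and all function names and the test instance are assumptions.

```python
import numpy as np

def soft_threshold(X, tau):
    """Entry-wise shrinkage: argmin_S tau*||S||_1 + 0.5*||X - S||_F^2."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: argmin_L tau*||L||_* + 0.5*||X - L||_F^2."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(M, L, S, lam, mu):
    """Value of the objective in (15)."""
    return (np.linalg.norm(L, "nuc") + lam * np.abs(S).sum()
            + np.linalg.norm(M - L - S) ** 2 / (2 * mu))

def solve_15_alternating(M, sigma, n_iter=50):
    """Alternating exact minimization of (15) over L and S (not APG)."""
    n = M.shape[0]
    lam, mu = 1 / np.sqrt(n), np.sqrt(2 * n) * sigma
    L, S = np.zeros_like(M), np.zeros_like(M)
    history = [objective(M, L, S, lam, mu)]
    for _ in range(n_iter):
        L = svt(M - S, mu)                      # S fixed: threshold mu
        S = soft_threshold(M - L, lam * mu)     # L fixed: threshold lam * mu
        history.append(objective(M, L, S, lam, mu))
    return L, S, history

# Small illustrative instance (sizes and seed are arbitrary choices).
rng = np.random.default_rng(0)
n, r, sigma = 40, 2, 0.1
L0 = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
S0 = np.where(rng.random((n, n)) < 0.05, rng.uniform(-5, 5, (n, n)), 0.0)
M = L0 + S0 + sigma * rng.standard_normal((n, n))
L_hat, S_hat, history = solve_15_alternating(M, sigma)
```

Each update solves its block subproblem exactly, so the objective values in `history` never increase. An accelerated proximal gradient scheme would typically need fewer iterations, which is why the paper prefers it.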

A. Comparison with An Oracle 

To further understand our algorithm, we would like to
compare its performance to the best possible accuracy one
can achieve, for instance, by the minimum mean-square-error
(MMSE) estimator over all low-rank and sparse matrix pairs.
However, because computing the MMSE estimate is not
computationally tractable, we instead resort to an oracle which
gives us information about the support $\Omega$ of $S_0$ and the row
and column spaces $T$ of $L_0$. Our oracle estimates $L$ and $S$ as
the solution $L_{\text{oracle}}$ and $S_{\text{oracle}}$ to the following least squares
problem:

$$\min_{L,S}\ \|M - L - S\|_F \quad \text{subject to} \quad L \in T,\ S \in \Omega. \tag{16}$$

Since we know the locations of the corrupted entries, we can
solve for $L_{\text{oracle}}$ and $S_{\text{oracle}}$ separately. That is, we first find
the matrix in $T$ which best fits the uncorrupted data in a
least squares sense. Under the hypotheses of Proposition 4, the
operator $\mathcal{P}_T\mathcal{P}_{\Omega^\perp}\mathcal{P}_T$ is invertible³ when restricted to $T$, and the
least squares solution is given by

$$L_{\text{oracle}} = (\mathcal{P}_T\mathcal{P}_{\Omega^\perp}\mathcal{P}_T)^{-1}\mathcal{P}_T\mathcal{P}_{\Omega^\perp}(M),$$

and the sparse component is given by

$$S_{\text{oracle}} = \mathcal{P}_\Omega(M - L_{\text{oracle}}).$$
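One way to evaluate the oracle formulas without forming the operator inverse explicitly is to run conjugate gradient on the linear system $(\mathcal{P}_T\mathcal{P}_{\Omega^\perp}\mathcal{P}_T)L = \mathcal{P}_T\mathcal{P}_{\Omega^\perp}(M)$ within $T$. The sketch below takes this route; the use of conjugate gradient, the function name `oracle_estimate`, and the test instance are assumptions, since the paper does not say how the oracle was computed.

```python
import numpy as np

def oracle_estimate(M, U, V, omega, n_iter=200, tol=1e-20):
    """Oracle estimates, given the singular subspaces (U, V) and the support mask omega."""
    PU, PV = U @ U.T, V @ V.T
    P_T = lambda X: PU @ X + X @ PV - PU @ X @ PV    # projection onto T
    P_Oc = lambda X: np.where(omega, 0.0, X)          # projection onto Omega^perp
    A = lambda X: P_T(P_Oc(P_T(X)))                   # P_T P_{Omega^perp} P_T
    b = P_T(P_Oc(M))
    # Conjugate gradient; every iterate stays in T because b and A(.) lie in T.
    L = np.zeros_like(M)
    res = b.copy()
    d = res.copy()
    rs = np.sum(res * res)
    for _ in range(n_iter):
        if rs <= tol:
            break
        Ad = A(d)
        alpha = rs / np.sum(d * Ad)
        L += alpha * d
        res -= alpha * Ad
        rs_new = np.sum(res * res)
        d = res + (rs_new / rs) * d
        rs = rs_new
    S = np.where(omega, M - L, 0.0)                   # S_oracle = P_Omega(M - L_oracle)
    return L, S

# Noise-free sanity check on a small assumed instance.
rng = np.random.default_rng(0)
n, r = 30, 2
L0 = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
Uf, _, Vtf = np.linalg.svd(L0, full_matrices=False)
U, V = Uf[:, :r], Vtf[:r, :].T
omega = rng.random((n, n)) < 0.1
S0 = np.where(omega, rng.uniform(-5, 5, (n, n)), 0.0)
L_or, S_or = oracle_estimate(L0 + S0, U, V, omega)
```

Without noise the oracle should return `L0` and `S0` up to numerical precision, since $L_0 \in T$ already satisfies the linear system. With noise, its error gives the baseline against which the convex program is compared.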

B. Experiment Results and Analysis 

We first evaluate the performance of (15) with a matrix $L_0$
whose rank $r = 10$ is fixed. We measure estimation errors
using the root-mean-squared (RMS) errors $\|\hat{L} - L_0\|_F/n$ and
$\|\hat{S} - S_0\|_F/n$ for the low-rank component and the sparse
component, respectively. Fig. 1(a) shows the RMS error with
varying noise level $\sigma$. In this experiment, the dimension
$n = 200$ and the fraction of corrupted entries $\rho_s = 0.2$
are fixed. As predicted by our main result, the RMS error
grows approximately linearly with the noise level. Moreover,
the RMS error obtained by solving (5) is only about twice the RMS error
achieved by the oracle introduced in the previous section.

Now we fix $\sigma = 0.1$. Fig. 1(b) and Fig. 2(a) show the
results with varying $\rho_s$ (when $n = 200$ is fixed) and $n$ (when
$\rho_s = 0.2$ is fixed). Fig. 1(b) illustrates that one can achieve
a higher breakdown point by knowing $\Omega$ and $T$. It is observed in
[3] that when the rank $r$ is fixed or grows sufficiently slowly as
$n$ increases, our method can recover more and more corrupted
entries. Here in Fig. 2(a) we see a similar phenomenon. As
$n$ increases, the RMS error decreases given a fixed fraction
of corrupted entries. That is, our approach can simultaneously
tolerate a large fraction of corrupted entries and a high level
of noise when the dimension $n$ is sufficiently large.

To further test the stability of (15), we examine how the
algorithm performs when the rank of $L_0$ grows in proportion
to $n$ and the number of errors in $S_0$ grows in proportion to $n^2$.
More precisely, in Fig. 2(b) we fix $\sigma = 0.1$, and plot the RMS
error as a function of $n$, with $\operatorname{rank}(L_0) = 0.1 \times n$ and $\rho_s = 0.1$.
The result shows that our approach can recover a
wide range of matrix pairs $(L_0, S_0)$ in the presence of noise.
Interestingly, these results also suggest that our analysis loses
a factor of $n$ with respect to the optimal bound.
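The factor of $n$ can be made explicit with a short calculation. This is a sketch under the assumption that $\delta$ is set to the typical size of the noise, which for i.i.d. $\mathcal{N}(0, \sigma^2)$ entries is about $n\sigma$:

```latex
\mathbb{E}\,\|Z_0\|_F^2 = n^2\sigma^2
  \;\Longrightarrow\; \delta \approx n\sigma,
\qquad
\frac{\|\hat{L}-L_0\|_F}{n}
  \;\le\; \frac{\sqrt{C}\,n\,\delta}{n}
  \;=\; \sqrt{C}\,\delta
  \;\approx\; \sqrt{C}\,n\,\sigma .
```

The bound (6) therefore allows the RMS error to grow linearly in $n$ for a fixed $\sigma$, whereas the simulated RMS error stays proportional to $\sigma$ and does not grow with $n$.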

VI. Discussion 

In this paper, we only present the result for square matrices
for simplicity. However, the arguments and results can be
easily modified to handle the general case. For instance, when
the matrices are $n_1 \times n_2$, let $n_{(1)} = \max(n_1, n_2)$ and
$n_{(2)} = \min(n_1, n_2)$. The conclusion of Theorem 1 can be stated as:
PCP with $\lambda = 1/\sqrt{n_{(1)}}$ succeeds with probability at least
[...] and $m \le \rho_s n_1 n_2$. Also, relation (6) in Theorem 2 becomes
$\|\hat{L} - L_0\|_F^2 + \|\hat{S} - S_0\|_F^2 \le C n_1 n_2 \delta^2$.

As suggested by the numerical results, one could hope to
improve the stability result by removing the dependence on
$n$. In this direction, we would like to point out that most of
our analysis seems to be tight, except (13), where we invoke
the generic relations between the nuclear norm, the $\ell_1$ norm, and
the Frobenius norm. A full examination of this problem may
require additional model assumptions. It is also very likely
that some results in the geometry of Banach spaces, namely
the spherical sections theorem and concentration of measure,
will play a key role in it.

³ In fact, since $\|\mathcal{P}_T\mathcal{P}_\Omega\mathcal{P}_T\| = \|\mathcal{P}_\Omega\mathcal{P}_T\|^2 \le 1/4$, the smallest eigenvalue
of $\mathcal{P}_T\mathcal{P}_{\Omega^\perp}\mathcal{P}_T$ is bounded below by $1 - 1/4 = 3/4$.

Fig. 1. (a) RMS errors as a function of $\sigma$ with $r = 10$, $\rho_s = 0.2$, $n = 200$. (b)
RMS errors as a function of $\rho_s$ with $r = 10$, $\sigma = 0.1$, $n = 200$.

Fig. 2. RMS errors as a function of $n$ with (a) $\sigma = 0.1$, $\rho_s = 0.2$, $r = 10$
fixed, (b) $\sigma = 0.1$, $\rho_s = 0.1$ and $r = 0.1 \times n$ growing in proportion to $n$.